Space Filling Curves:

What is a Space Filling Curve?

A Space-Filling Curve (SFC) is a special kind of curve that passes through every point in a grid or space, visiting them in a specific order. Think of it like a clever way to turn 2D (or higher) data into 1D, while preserving spatial closeness as much as possible. -How It Works The curve visits all grid cells in a recursive, structured order. The most common SFCs are: Z-order (Morton Order): Interleaves binary bits of coordinates (easy to compute, good locality) Hilbert Curve: More complex but keeps neighbors very close in 1D Peano Curve: Another variant that fills in a more dense path The grid is divided recursively, and the curve visits each small cell in a particular pattern. -Where It's Used Spatial indexing (especially in R-Trees, databases like PostgreSQL/SpatiaLite) Image compression and memory layout Big data systems (e.g., Apache HBase uses Z-ordering) Physics simulations and cache-efficient computation -Strengths Preserves spatial locality: nearby points stay close in 1D order Great for indexing and sorting spatial data Works well for range queries and KNN when paired with other data structures -Weaknesses Doesn’t preserve all spatial relationships (far points may still appear close) Harder to understand and visualize than trees More complex to implement for high dimensions -Example If you have points on a 4×4 grid, a Z-order curve will visit them in a zigzag-like binary bit pattern: (0,0) → (1,0) → (0,1) → (1,1) ... By mapping each point to a single Z-value, you can sort or search more easily.

Space-Filling Curve Operations

Construction:

  1. Assign curve keys (e.g. Morton code) to all data points
  2. Sort data by key to impose spatial order

Insertion:

  1. Compute key for new point using curve mapping
  2. Insert in sorted position to preserve order

Search:

  1. Range Query: Convert region to key range(s)
  2. Nearest Neighbor: Scan neighbors along curve order

Deletion:

  1. Locate key in sorted list
  2. Remove entry and shift data if needed

Complexities

Operation Average Case Worst Case
Bulk LoadingO(n log n)O(n log n)
InsertionO(log n)O(logn)
DeletionO(log n)O(logn)
Range QueryO(log n + m)O(n)
k-NN QueryO(k log n)O(kn)

Insertion:

When a new point is added, its Z-value (or Hilbert value) is computed to determine its 1D position on the curve. The point is then inserted into a sorted structure like a balanced tree or array. This lookup and insert step is done in O(log n).

Deletion:

The point to be removed is located by its Z-value and deleted from the sorted list. Since the data is stored in order, binary search allows fast access and removal, also in O(log n).

k-NN Search:

After locating the query point’s position on the curve, its neighboring entries are scanned. The k closest points (by real Euclidean distance) are selected. Using curves like Hilbert helps preserve spatial proximity better, improving efficiency. The complexity is O(k log n).

Range Query:

The multidimensional query box is transformed into corresponding 1D intervals along the curve. These intervals are scanned in the sorted structure. Each candidate point is checked to see if it truly lies within the original multidimensional range. Complexity is O(log n + m), where m is the number of results.

Node Capacity [max = 8] =